ABSTRACT



A brief experimental study is undertaken to determine the utility of a 
new pilot rating scale in a fixed base tracking task. The scale is the 
nonadjectival, nonordinal, linear scale introduced by C. V. Schufeldt. 
The "subcritical" tracking task developed by Jex, McDonnell and Phatak 
is utilized in the experiment. The scale’s potential for detecting 
minor changes in system acceptability is demonstrated. 

I. INTRODUCTION



A.) Background 

The Cooper-Harper pilot rating scale for the evaluation of aircraft 
has found wide acceptance in the field of handling qualities research. 

The scale, shown in Figure 1, is a means of quantifying a pilot’s 
impressions of the handling qualities of an aircraft which is involved 
in a specific mission element or task. The scale is adjectival, ordinal 
and nonlinear in nature. It is adjectival in that descriptors such as 
"controllable", "adequate", and "satisfactory" appear in the flow diagram 
used by the pilot. It is ordinal in that handling qualities are ranked 
in order of decreasing acceptability. It is nonlinear in that a rating 
of, say 8, does not necessarily indicate handling qualities which are 
twice as unacceptable as those receiving a rating of 4. The utility of
the Cooper-Harper scale has been recently enhanced by a method for
predicting ratings.

As successful and useful as this rating scale has been it is not 
without its weaknesses. Chief among these are its qualitative character 
and its ordinal nature. In an attempt to alleviate some of these 
difficulties, J. D. McDonnell proposed a "global" rating scale for 
handling qualities investigations. This scale, shown in Figure 2, is an 
adjectival, nonordinal, linear scale developed through the methods of 
psychometrics. While not receiving the wide acceptance of the
Cooper-Harper scale, the Global scale has been utilized in handling
qualities investigations.

McDonnell's work centered about finding the coordinates of certain
adjectival phrases on a psychological continuum which he called the ^ 
scale. The adjectival phrases were those most commonly encountered in 
handling qualities research. 

The psychological continuum can be interpreted in the following
manner. If a measurement is made on a physical object with a nonhuman 
instrument of some sort, the measure is an objective one and the resulting 
data lie along a physical continuum. When a human observer estimates a 
measure, it is a subjective judgment and the estimates lie along a 
psychological continuum. 

C. V. Schufeldt advanced yet another rating scale. His scale,
shown in one of its forms in Figure 3, is nonadjectival, nonordinal,
and linear in nature. The impetus behind Schufeldt's research was the
idea of developing a scale which would reflect relatively minor differences
in system characteristics. To accomplish this, the scale would have to
exhibit a good deal of sensitivity without overtaxing the resolution
capability of the operator. Schufeldt's hypothesis was that a linear
rating scale coincident with the psychological continuum begets such
sensitivity. While the Global scale of McDonnell is conceptually close
to this realization, Schufeldt felt that in certain applications, the
adjectives were a hindrance. He wanted to know if removing the adjectives
would allow the rater to transpose his impressions of a system directly
to a linear, numerical index. In addition, he wondered if allowing the
subject to fractionize his rating would increase scale sensitivity.

Schufeldt investigated his hypothesis by submitting a child’s
puzzle ("EVEN-STEVEN" by Kohner) to some thirty students in the Department
of Aeronautics. Upon successful solution of the puzzle, or at the
expiration of an allotted time, whichever occurred first, the subject was
asked to rate his impression of the difficulty he encountered in working
the puzzle. The subjects indicated their ratings on three different
scales, one of which is shown in Figure 3. Schufeldt found a high
correlation coefficient (e.g. 0.928 for the scale of Figure 3) between
ratings and performance.

B.) Critical-Subcritical Tasks 

Encouraged by Schufeldt's results, this author was eager to use
the scale in an environment more closely related to handling qualities 
investigations, i.e. fixed base tracking tasks. 

If Schufeldt's scale does indeed possess a sensitivity superior to
previous scales, it should yield better results in areas where these
scales were overly sensitive, i.e. the high end (8-10) of the
Cooper-Harper scale. If the experiment is to be tractable, the task difficulty
should be controlled by as few parameters as possible. Finally, since
it was desired to keep the duration of the entire experimental program
short, a task which tended to minimize training times should be selected.
These criteria pointed toward the selection of the "critical-subcritical"
tracking tasks as pioneered by Jex, McDonnell, and Phatak.

Critical task (first-order) refers to a special compensatory tracking
task in which the real pole, $\lambda$, of a first order controlled element

$$Y_c(s) = \frac{K_c \lambda}{s - \lambda}$$

is moved slowly into the right half of the s plane until the subject or
operator can no longer maintain control. The value of $\lambda$ at the onset of
instability is called the critical instability score, $\lambda_c$. No input is
required since operator remnant serves to excite the system.
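
The mechanics of a critical-task run can be sketched numerically. The following is only an illustration under stated assumptions, not the analog mechanization used in the experiment: the human operator is replaced by a hypothetical pure gain with a time delay plus a small random remnant, the controlled element is taken as $\dot{y} = \lambda y + \lambda u$ (unit control gain), and "loss of control" is taken to mean the output exceeding an arbitrary display limit. The names `run_critical_task` and `linear_stability_limit`, and all default values other than the starting pole and pole rate of Table I, are assumptions made for the sketch.

```python
import math
import random
from collections import deque


def linear_stability_limit(gain, delay):
    """Pole value at which the ASSUMED gain-plus-delay loop goes unstable.

    Derived for y' = lam*y - gain*lam*y(t - delay); requires gain > 1.
    """
    return math.acos(1.0 / gain) / (delay * math.sqrt(gain * gain - 1.0))


def run_critical_task(gain=1.5, delay=0.2, lam0=1.0, lam_rate=0.1,
                      dt=0.01, limit=1.0, remnant_std=1e-3, seed=0,
                      t_max=120.0):
    """Simulate one run; return the pole value when control is lost."""
    rng = random.Random(seed)            # seeded, so the run is repeatable
    n_delay = max(1, round(delay / dt))  # operator delay in integration steps
    past_outputs = deque([0.0] * n_delay)
    y = 0.0                              # displayed output (error = -y)
    for k in range(int(t_max / dt)):
        lam = lam0 + lam_rate * k * dt   # pole ramps into instability
        if abs(y) > limit:               # hypothetical loss-of-control test
            return lam
        delayed_y = past_outputs.popleft()
        past_outputs.append(y)
        # Assumed operator: delayed proportional correction plus remnant.
        u = -gain * delayed_y + rng.gauss(0.0, remnant_std)
        y += dt * lam * (y + u)          # Euler step of y' = lam*y + lam*u
    return None                          # control never lost within t_max


lam_c = run_critical_task()
```

With these assumed values the linear stability boundary falls near 3.76 rad/sec, and the simulated run loses control somewhat later because the divergence needs time to grow from the remnant level to the display limit. No input signal is supplied; the remnant alone excites the loop, as the text describes.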

Subcritical task (first-order) refers to a similar tracking
situation in which the value of the unstable pole, $\lambda$, is kept at a
constant and controllable value, $\lambda_s$, throughout the run. In subcritical
tracking, a random appearing input is usually applied. Figure 4 is a
block diagram representing the critical and subcritical systems.



II. EXPERIMENT 

A.) Procedure

Fourteen subjects were chosen for the experiment. Of these fourteen, 
six were military pilots, two were civilian pilots and six were nonpilots. 

The basic experimental procedure went as follows. A subject performed
the critical task experiment twenty times in succession. An average
critical instability score, $\bar{\lambda}_c$, was obtained as the mean of his five
highest $\lambda_c$ scores. Five subcritical systems were then chosen with pole
locations given by:

$$\lambda_{s_i} = \frac{i}{6}\,\bar{\lambda}_c, \qquad i = 1, 2, 3, 4, 5$$
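
As a minimal sketch of this bookkeeping, the average critical score and the five subcritical pole locations could be computed as below. The function names are hypothetical, and the divisor of six is inferred from the range of 1/6 to 5/6 quoted in the conclusions rather than read directly from the equation.

```python
def mean_critical_score(scores, n_best=5):
    """Mean of the n_best highest critical instability scores."""
    best = sorted(scores, reverse=True)[:n_best]
    return sum(best) / len(best)


def subcritical_poles(lam_c_bar, n_levels=5):
    """Pole locations i * lam_c_bar / (n_levels + 1) for i = 1..n_levels."""
    return [i * lam_c_bar / (n_levels + 1) for i in range(1, n_levels + 1)]
```

For example, a subject whose average critical score were 6.0 rad/sec would be given subcritical poles at 1.0, 2.0, 3.0, 4.0 and 5.0 rad/sec.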



The subject made ten runs of fixed duration, in succession, for each of
these systems. After each set of ten runs, the subject was asked to
rate the system as per the instructions of Figure 5. The five subcritical
systems were ordered randomly and this random order, once selected, was
reversed for each successive operator. This means operator 1 tracked the
subcritical systems in the order: $\lambda_{s_3}$, $\lambda_{s_1}$, $\lambda_{s_4}$, $\lambda_{s_5}$, $\lambda_{s_2}$,
while for operator 2 the order was: $\lambda_{s_2}$, $\lambda_{s_5}$, $\lambda_{s_4}$, $\lambda_{s_1}$, $\lambda_{s_3}$, etc.
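
The alternating presentation order can be written out as a short sketch. It assumes that the "etc." means odd-numbered operators received the base order and even-numbered operators its reverse; the function name is hypothetical.

```python
def presentation_order(operator_number, base_order=(3, 1, 4, 5, 2)):
    """Subcritical system indices in the order shown to a given operator.

    Assumption: odd-numbered operators get the base order, even-numbered
    operators get the same order reversed.
    """
    order = list(base_order)
    if operator_number % 2 == 0:
        order.reverse()
    return order
```

Alternating a fixed order and its reverse balances each system's position in the sequence across pairs of operators, which helps keep practice and fatigue effects from being confused with task difficulty.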

The Measurement Systems Inc. isometric, finger grip manipulator
was utilized for the study. The system error was displayed to the
operator as the displacement of a horizontal line on an oscilloscope
screen. The system dynamics, input and mean square error circuits were
mechanized on a small analog computer. Table I summarizes the experimental
setup. Figure 6 shows the layout.

B.) Discussion 

The parameters of Table I were selected to coincide as nearly as
possible with those of similar experiments conducted by Systems Technology
Inc. (STI). Due to equipment limitations, the sum of only two sinusoids
was used as an input for the subcritical task. Their magnitudes and
frequencies were chosen to coincide with those of the two lowest
frequency sinusoids used by STI. Were the controlled element, $Y_c(s)$,
stable, the sum of just two sinusoids would probably not appear random 
enough to ensure compensatory behavior. However, the open loop instability 
made it very difficult for the operator to utilize anything but error 
information in tracking. 

In view of the large number of runs in a single experiment (20 critical
+ 50 subcritical = 70 runs) it was decided to reduce the subcritical run
lengths from an original 100 seconds to 50 seconds. Early experiments 
with the 100 second lengths resulted in considerable operator fatigue and 
poor performance. The shorter run lengths, however, probably decreased 
the accuracy of mean square error scores. 

A brief comment on the rating instructions of Figure 5 is in order. 

At no time was the subject explicitly instructed to associate a particular 
scale value with a particular system. In addition, each time the subject 
was asked to evaluate a system, he was given a clean rating sheet. 

III. RESULTS



Figure 7 summarizes the experimental results. A set of typical 
time histories is shown in Figure 8. Table II gives the performance 
and ratings of the fourteen test subjects. The error scores for the 
first four subjects were deleted since poor analog scaling caused these 
values to be inaccurate. 

The correlation coefficient for the rating vs. $\lambda_s/\lambda_c$ data is 0.73
as shown in Figure 7. The mean ratings are seen to fall quite close to
the regression line. Regression analysis of ratings vs. performance was
hampered because in five of the subcritical configurations
the operators lost control in at least eight of the ten runs. It was
difficult to quantify this performance and relate it to that obtained
when control was maintained for the full 50 seconds. Hence no further
analysis of the error scores beyond that shown in Table II has been
presented.
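
For readers who wish to repeat this kind of analysis on their own rating data, a correlation coefficient and regression line of the sort plotted in Figure 7 can be computed as sketched below. This is a generic Pearson correlation and ordinary least-squares fit, assumed here to be what was used; the function names are hypothetical and no values from Table II are reproduced.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def least_squares_line(x, y):
    """Slope and intercept of the ordinary least-squares line y = m*x + b."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x
```

Here `x` would hold the pole ratio for each rated system and `y` the corresponding rating, one pair per subject and system.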



IV. CONCLUSIONS 



a.) It does appear that the human operator can transpose his
impressions of a system directly to a linear numerical index. The lack 
of adjectives does not appear to detract from the operator’s ability to 
generate subjective opinion. 

b.) The ability of the subject to utilize the linear, nonadjectival
scale does not appear to depend upon previous experience with rating
scales in general. The test subjects ranged from the decidedly
non-technical (the author's wife) to Navy carrier pilots in the Department
of Aeronautics.

c.) The scale appears reasonably sensitive, i.e. the mean ratings
are seen to range from 2.9 to 8.4 (55% of the rating scale) as $\lambda_s/\lambda_c$
ranges from 1/6 to 5/6 (66.7% of the $\lambda_s/\lambda_c$ scale). The standard deviations
of the ratings are fairly uniform across the $\lambda_s/\lambda_c$ scale. This indicates
constant sensitivity along the rating scale which is a characteristic of
the psychological continuum.
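
The two percentages quoted in item c.) can be checked directly, assuming the rating scale spans 0 to 10 as drawn on the rating sheet of Figure 5:

```latex
\[
\frac{8.4 - 2.9}{10 - 0} = 0.55 = 55\%,
\qquad
\frac{5}{6} - \frac{1}{6} = \frac{2}{3} \approx 66.7\%
\]
```

Taken together, the quoted end points correspond to roughly 5.5 rating units over a change of 2/3 in $\lambda_s/\lambda_c$, or about 8.25 rating units per unit of pole ratio. This is only the ratio of the end points and should not be read as the slope of the regression line in Figure 7.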

It must be emphasized that the rating scale investigated here is 
not offered as a replacement for the highly successful Cooper-Harper 
scale. This should be obvious. However, there may arise instances 
when one desires to detect, in a relative sense, minor changes in system 
acceptability. In such instances, adjectival scales are simply not 
appropriate since they lack the necessary sensitivity or overtax the 
operator's resolution capability. In these cases, a scale such as the
one investigated here may prove useful. 

Figure 2 A Global Rating Scale for Handling Qualities Evaluation (scale axis: Favorability of Handling Qualities)

Figure 3 Schufeldt's Nonadjectival, Nonordinal, Linear Rating Scale

Figure 4 Critical and Subcritical Tracking Tasks



SUBJECT 

DATE 



The critical task provided information regarding the limits of 
your ability to control an unstable system. Using the scale below, 
indicate the degree of difficulty you encountered in controlling the 
subcritical system checked. All the systems you will be asked to rate 
in this manner will be unstable. 

Increasing Difficulty -->

0   1   2   3   4   5   6   7   8   9   10
|---|---|---|---|---|---|---|---|---|---|

System 1 

System 2 

System 3 

System 4 

System 5 



Figure 5 Rating Sheet for Subcritical Task 

Figure 6 Tracking Task Equipment Layout

Figure 7 Ratings vs. $\lambda_s/\lambda_c$ (Correlation Coefficient = 0.73)




Critical Instability Score $\lambda_c$ = 3.62



Figure 8 Critical Task Stick Output and Error Signals; Subject 14 

Figure 8 cont'd. Subcritical Task Input, Stick Output and Error Signals; Subject 14

TABLE I

Critical and Subcritical Task Parameters

$Y_c(s) = \frac{K_c \lambda}{s - \lambda}$

$\lambda = \lambda_0 + \dot{\lambda} t$ (Critical Task)

$\lambda_0$ = 1.0 rad/sec

$\dot{\lambda}$ = 0.1 rad/sec$^2$

$K_c$ = control/display sensitivity
      = 0.9 cm scope deflection/newton stick force

$K_p$ = display viewing gain for 50 cm nominal viewing distance
      = 1.0 degree visual angle/cm display deflection

$i(t)$ = input (Subcritical Task)
       = 0.494 sin 0.502 t + 0.460 sin 1.256 t cm

$\overline{i^2(t)}$ = mean square input
                    = 0.23 cm$^2$
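
The tabulated mean square input can be checked from the two sinusoid amplitudes. The sketch below compares the analytic value, half the sum of the squared amplitudes, with a direct time average over one 50 second run; the function names and the sampling step are assumptions made for illustration.

```python
import math


def subcritical_input(t):
    """Two-sinusoid input of Table I, in cm of display deflection."""
    return 0.494 * math.sin(0.502 * t) + 0.460 * math.sin(1.256 * t)


def mean_square(signal, duration, dt=0.001):
    """Time-averaged square of a signal over [0, duration)."""
    n = int(round(duration / dt))
    return sum(signal(k * dt) ** 2 for k in range(n)) / n


# Sinusoids at different frequencies contribute independently to the
# mean square, so the long-run value is the sum of amplitude^2 / 2.
analytic = (0.494 ** 2 + 0.460 ** 2) / 2
measured = mean_square(subcritical_input, 50.0)
```

The analytic value is about 0.228, which rounds to the tabulated 0.23. With the frequencies as tabulated, a 50 second run spans very nearly 4 cycles of the slower sinusoid and 10 of the faster, so the average over a single run stays close to the long-run value.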






TABLE II 

Experimental Results - Ratings and Performance 
Indicates Subject Lost Control in at Least Eight of Ten Runs